Second International Workshop on Utility-Based Data Mining Workshop Chairs

Authors

  • Bianca Zadrozny
  • Gary Weiss
  • Maytal Saar-Tsechansky
Abstract

Researchers often use clinical trials to collect the data needed to evaluate a hypothesis or to produce a classifier. During this process, they must pay the cost of performing each test. Many studies run a comprehensive battery of tests on each subject, for as many subjects as the budget allows; this is the "round robin" (RR) approach. We consider a more general model, in which the researcher sequentially decides which single test to perform on which specific individual, again subject to spending only the available funds. Our goal is to use these funds most effectively, collecting the data that allows us to learn the most accurate classifier. We first explore the simplified "coins version" of this task. After observing that this task is NP-hard, we consider a range of heuristic algorithms, both standard and novel, and observe that our "biased robin" approach is both efficient and much more effective than most other approaches, including the standard RR approach. We then apply these ideas to learning a naive Bayes classifier and see similar behavior. Finally, we consider the most realistic model, in which both the researcher gathering data to build the classifier and the user (e.g., a physician) applying this classifier to an instance (a patient) must pay for the features used. For example, the researcher has $10,000 to acquire the feature values needed to produce an optimal $30/patient classifier. Again, we see that our novel approaches are almost always much more effective than the standard RR model. This is joint work with Aloak Kapoor, Dan Lizotte and Omid Madani. See the Budgeted Learning Webpage at http://www.cs.ualberta.ca/~greiner/BudgetedLearning.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. UBDM'06, August 20, 2006, Philadelphia, Pennsylvania, USA. Copyright 2006 ACM 1-59593-440-5/06/0008...$5.00.
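To make the "coins version" concrete, the following is a minimal sketch, not the authors' implementation. It assumes each coin has an unknown head probability with a uniform Beta(1, 1) prior, that each flip costs one unit of budget, and that the aim is to pick the coin most likely to land heads. The abstract does not define "biased robin"; the rule used here (keep flipping the same coin after heads, move to the next coin after tails) is an assumption, and all function names are hypothetical.

```python
import random


def round_robin(n_coins, budget, flip):
    """Flip the coins in a fixed cycle until the budget is spent."""
    # Each entry is [heads, tails] pseudo-counts, starting from a Beta(1, 1) prior.
    counts = [[1, 1] for _ in range(n_coins)]
    for t in range(budget):
        i = t % n_coins
        counts[i][0 if flip(i) else 1] += 1
    return counts


def biased_robin(n_coins, budget, flip):
    """Assumed rule: stay on the current coin after heads, advance after tails."""
    counts = [[1, 1] for _ in range(n_coins)]
    i = 0
    for _ in range(budget):
        if flip(i):
            counts[i][0] += 1
        else:
            counts[i][1] += 1
            i = (i + 1) % n_coins  # a tails outcome moves us to the next coin
    return counts


def best_coin(counts):
    """Index of the coin with the highest posterior mean head probability."""
    return max(range(len(counts)), key=lambda i: counts[i][0] / sum(counts[i]))


def make_flipper(probs, rng):
    """Simulated coins: flip(i) returns True (heads) with probability probs[i]."""
    return lambda i: rng.random() < probs[i]


if __name__ == "__main__":
    rng = random.Random(0)              # fixed seed, purely for repeatability
    probs = [0.3, 0.5, 0.8, 0.6]        # hypothetical hidden head probabilities
    for policy in (round_robin, biased_robin):
        counts = policy(len(probs), 40, make_flipper(probs, rng))
        print(policy.__name__, "selects coin", best_coin(counts))
```

A fair comparison would average the quality of the selected coin over many simulated runs and budgets; a single run like the one above says little about either policy.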


Similar resources

First International Workshop on Utility-Based Data Mining Workshop Chairs:

Data mining requires certain information—for example, supervised learning requires training data. Some prior research has recognized that this information often does not simply present itself for free, but involves various acquisition costs. In addition, applying the learned models involves costs and benefits. I introduce a general economic setting that includes as special cases the settings of...


New perspectives on Causal Networks: the first CaNew workshop

We are pleased to introduce a selection of the papers presented at the 1998 workshop on "Causal Networks from Inference to Data Mining", CaNew '98, [59]. This workshop was initiated from the feeling, shared by the organizers and co-chairs, that the field of Bayesian and, in general, Causal Networks deserved special attention from the international research community. We had a growing feeling tha...


Proceedings of the Second International Workshop on Knowledge Discovery from Sensor Data (Sensor-KDD 2008), August 24, 2008, Las Vegas, Nevada, USA, held in conjunction with SIGKDD '08 Workshop Chairs

The detection of outliers from spatio-temporal data is an important task due to the increasing amount of spatio-temporal data available and the need to understand and interpret it. Due to the limitations of current data mining techniques, new techniques to handle this data need to be developed. We propose a spatio-temporal outlier detection algorithm called Outstretch, which discovers the outli...


Current and Future Challenges in Mining Large Networks: Report on the Second SDM Workshop on Mining Networks and Graphs

We report on the Second Workshop on Mining Networks and Graphs held at the 2015 SIAM International Conference on Data Mining. This half-day workshop consisted of a keynote talk, four technical paper presentations, one demonstration, and a panel on future challenges in mining large networks. We summarize the main highlights of the workshop, including expanded written summaries of the future chal...


6th IEEE International Workshop on PervasivE Learning PerEL 2010: Message from the workshop chairs

The 6th IEEE International Workshop on PervasivE Learning (PerEL 2010) continues a successful series started five years ago at IEEE PerCom. The workshop aims to address issues of pervasive computing in combination with new types of learning, teaching and working. Based on first results and practical experiences regarding tools and applications presented at previous PerEL workshops it focuses th...




Publication date: 2006